CLAIMS 



1. A method to control and monitor a hybrid cooling system for cooling multiple
logic modules with different heat loads to the same temperature while maintaining system 
clock speeds as fast as viable, the method comprising: 

cooling the multiple logic modules with a single refrigerant unit having a backup 
air cooling system; 

monitoring temperatures of any logic module subject to temperature changes; 

controlling a first PID loop of electronic expansion valves in fluid communication 
with a corresponding evaporator, each expansion valve controlling the temperature of a 
corresponding logic module operating, each logic module having a heat load cooled by at 
least one of the single refrigerant unit and the backup air cooling system; and 

controlling a second PID loop of a compressor speed of the single refrigerant unit
to extend refrigeration capacity and control for cooling multiple logic modules once an 
expansion valve has maximized a cooling capacity that the expansion valve can deliver. 
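
By way of non-limiting illustration, the two control loops recited above might be sketched as follows. The class and function names, gains, and output limits are hypothetical and are not taken from the claims; the sketch assumes the compressor loop acts only on the temperature error of modules whose expansion valve is already fully open, and it omits anti-windup handling for brevity.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Minimal PID controller with a clamped output (illustrative only)."""
    kp: float
    ki: float
    kd: float
    out_min: float
    out_max: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        """Return the clamped controller output for one sample period."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        return min(self.out_max, max(self.out_min, raw))


def control_step(module_temps, target, valve_pids, compressor_pid, dt=1.0):
    """Run the per-module valve loops first, then the compressor loop."""
    valve_openings = []
    worst_saturated_error = 0.0
    for temp, pid in zip(module_temps, valve_pids):
        error = temp - target  # positive when the module is warmer than targeted
        opening = pid.step(error, dt)
        valve_openings.append(opening)
        # A valve at its upper limit cannot deliver any more cooling.
        if opening >= pid.out_max and error > worst_saturated_error:
            worst_saturated_error = error
    # Assumption: the compressor loop responds only to saturated valves.
    compressor_speed = compressor_pid.step(worst_saturated_error, dt)
    return valve_openings, compressor_speed
```

In this sketch a valve opening is expressed in percent and the compressor speed in RPM; with no saturated valve the compressor output rests at its lower limit.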

2. The method of claim 1 further comprising: 

controlling a blower speed for the airflow that cools a refrigerant condenser of the 
single refrigerant unit if a thermal sensor recording a tube temperature entering the 
condenser is different than a temperature of air exiting the condenser indicative of at least 
one of an improper and an unstable cooling condition. 

3. The method of claim 2, wherein the blower speed is increased when the tube 
temperature is cooler than the temperature of air exiting the condenser. 
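
As a hedged sketch of the blower adjustment of claims 2 and 3, the comparison could be expressed as below. The function name, step size, and speed ceiling are assumptions made for illustration only; the claims do not specify them, nor what happens when the tube temperature is warmer than the exiting air.

```python
def next_blower_speed(tube_temp_c, air_exit_temp_c, speed_rpm,
                      step_rpm=100, max_rpm=4000):
    """Raise the blower speed when the tube entering the condenser is cooler
    than the air leaving it; otherwise leave the speed unchanged."""
    if tube_temp_c < air_exit_temp_c:
        # Hypothetical fixed step, capped at an assumed maximum speed.
        return min(max_rpm, speed_rpm + step_rpm)
    return speed_rpm
```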

4. The method of claim 1 further comprising:

monitoring the cooling state of each logic module of the multiple modules along 
with error registers to denote any cooling hardware failures of the single refrigerant unit 
not yet repaired. 

5. The method of claim 4, wherein the monitoring is done through redundant 
thermal sensors directly monitoring a region representative of circuit temperatures of a 
corresponding logic module. 

6. The method of claim 5, wherein the region corresponds with one of a hat,
substrate, and individual chips of a multi chip module (MCM).

7. The method of claim 5, wherein the thermal sensors are compared for at least 
one of miscompare properties and insanity limits to check accuracy of each measured 
temperature. 

8. The method of claim 5, wherein the thermal sensors include a first thermal 
sensor sensed by the refrigerant unit and second and third thermal sensors read by a 
power supply supplying power to the multiple logic modules to ensure at least one of full
redundancy and accuracy. 

9. The method of claim 8, wherein the second and third thermal sensors are 
compared to each other and to the first thermal sensor versus miscompare limits, the 
second and third thermal sensors providing thermal protection of the multiple logic 
modules by dropping power if at least one of second and third thermal sensors indicate a 
temperature corresponding to a damage limit. 
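
The sensor checks of claims 7 through 9 might be illustrated, under stated assumptions, as follows. The limit values, the dictionary layout of the result, and the function name are hypothetical; `t1` stands for the sensor sensed by the refrigerant unit and `t2`, `t3` for the sensors read by the power supply.

```python
def check_sensors(t1, t2, t3, miscompare_limit=5.0,
                  sane_range=(-40.0, 150.0), damage_limit=95.0):
    """Compare three redundant readings (degrees C) and flag problems."""
    readings = {"t1": t1, "t2": t2, "t3": t3}
    # Insanity check: a reading outside the physically plausible range.
    insane = [name for name, value in readings.items()
              if not (sane_range[0] <= value <= sane_range[1])]
    # Miscompare check: t2 against t3, then each against t1.
    pairs = [("t2", "t3"), ("t1", "t2"), ("t1", "t3")]
    miscompares = [(a, b) for a, b in pairs
                   if abs(readings[a] - readings[b]) > miscompare_limit]
    # Thermal protection: either power-supply sensor at the damage limit.
    drop_power = t2 >= damage_limit or t3 >= damage_limit
    return {"insane": insane, "miscompares": miscompares,
            "drop_power": drop_power}
```

Note that in this sketch an insane reading on `t2` or `t3` above the damage limit still drops power; a real implementation might treat that case differently.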

10. The method of claim 1, wherein the first PID loop control opens an
electronically controlled expansion valve when a corresponding logic module is higher 
than targeted and closes the valve when cooler than targeted. 

11. The method of claim 1, wherein the refrigerant unit includes a condenser, the
electronically controlled expansion valves, the compressor, and a controller providing 
control signals to the expansion valves all contained within a modular refrigeration unit, 
and each expansion valve is proximate a corresponding evaporator in thermal 
communication with a respective logic module. 

12. The method of claim 11, wherein the hybrid cooling system includes liquid 
cooling provided by the refrigerant system and the backup air cooling provided by heat 
sink fins in thermal communication with each corresponding evaporator. 

13. A method to determine a proper clock cycle time for multiple logic modules 
with different heat loads while maintaining the clock cycle time as fast as viable, the 
method comprising: 

determining a thermal state of each logic module of the multiple logic modules, 
each thermal state defined by a discrete temperature range associated with a clock speed 
predetermined to be a proper clock cycle time for the temperature range; and 

determining whether a primary cooling means has been repaired. 

14. The method of claim 13 further comprising: 

turning on a backup air cooling fan if a temperature of any of the multiple logic
modules is above acceptable levels of cooling by the primary cooling means.

15. The method of claim 14 further comprising:

controlling a fan speed of the backup cooling fan to prevent oscillation between 
thermal states. 

16. The method of claim 13 further comprising:

increasing a voltage applied to a logic module to optimally use at least one of 
available cooling and power when operating in a backup cooling mode for maximum 
clock speed at a given temperature. 

17. The method of claim 13 further comprising: 

decreasing a voltage applied to a logic module when operating in a backup 
cooling mode and at least one of cooling or power is unavailable to reduce leakage 
currents that warmer degraded temperatures generate. 

18. The method of claim 13, wherein said determining a thermal state of each 
logic module of the multiple logic modules is done through redundant thermal sensors 
directly monitoring a region representative of circuit temperatures of a corresponding 
logic module to provide at least one of thermal protection and redundancy to guide 
cooling control. 

19. The method of claim 18, wherein the region corresponds with one of a hat, 
substrate, and individual chips of a multi chip module (MCM). 

20. The method of claim 18, wherein the thermal sensors are compared for at least 
one of miscompare properties and insanity limits to check accuracy of each measured 
temperature. 

21. The method of claim 20, wherein the thermal sensors include a first thermal 
sensor sensed by the refrigerant unit and second and third thermal sensors read by a 
power supply supplying power to the multiple logic modules to ensure at least one of full
redundancy and accuracy. 

22. The method of claim 21, wherein the second and third thermal sensors are
compared to each other and to the first thermal sensor, the second and third thermal 
sensors providing thermal protection of the multiple logic modules by dropping power if 
at least one of second and third thermal sensors indicate a temperature corresponding to a 
damage limit. 

23. The method of claim 14 further comprising: 

operating the backup air cooling fans in a manner to ensure that the fans always
turn on even if the primary cooling means has failed; and

operating the backup air cooling fans in a manner to ensure that the fans do not
come on so soon as to cause an oscillation of a cooling state when the primary cooling
means has failed.

24. A method to initialize the logic clocks for multiple logic modules in a fail-safe 
parallel manner, the method comprising: 

cooling the multiple logic modules with a hybrid cooling system, the hybrid 
cooling system includes a refrigerant unit as a primary cooling means and backup air 
cooling as a secondary cooling means; and 

issuing parallel "pre-cooling" commands to each logic module cooled by the 
refrigerant unit that allows the primary cooling means a head start in cooling prior to the 
logic clocks being turned on. 

25. The method of claim 24 further comprising: 

signaling that a pre-cooling temperature state has been reached; and

initiating initial microcode load (IML).
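
A minimal sketch of the parallel pre-cooling sequence of claims 24 and 25 is given below, assuming a simulated linear cool-down in place of real hardware. The function names, cooling rate, step limit, and the `"IML"`/`"HOLD"` return values are hypothetical and serve only to show the ordering: all modules are pre-cooled concurrently, and the microcode load starts only once every module signals that the pre-cooling state has been reached.

```python
from concurrent.futures import ThreadPoolExecutor


def precool(start_temp_c, target_c, cool_rate_c=5.0, max_steps=10):
    """Simulate cooling one module; return True if the target is reached."""
    temp = start_temp_c
    for _ in range(max_steps):
        if temp <= target_c:
            return True
        temp -= cool_rate_c  # assumed fixed temperature drop per step
    return temp <= target_c


def initialize(modules, precool_target_c):
    """Issue pre-cool commands to all modules in parallel, then decide."""
    # modules is a non-empty list of (module_id, start_temp_c) tuples.
    with ThreadPoolExecutor(max_workers=len(modules)) as pool:
        reached = list(pool.map(
            lambda m: precool(m[1], precool_target_c), modules))
    # Clocks are started (IML) only when every module has signaled readiness.
    return "IML" if all(reached) else "HOLD"
```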

26. The method of claim 25 further comprising:

resetting an integral term in a PID controller when the logic clocks are sensed to 
have first come on to improve the refrigerant unit cooling accuracy. 

27. The method of claim 26 further comprising: 

setting and verifying current phase lock loop patterns before any change is 
attempted with respect to the logic clocks. 

28. The method of claim 27 further comprising: 

signaling any change in a cooling state of any of the multiple logic modules with 
interrupts. 

29. The method of claim 28 further comprising: 

reviewing the cooling state of each logic module of the multiple logic modules 
running in a given server; and 

adjusting the clock speed target based on a logic module having the most thermal 
degradation. 

30. The method of claim 29 further comprising: 

incrementing the clock speed using a two-step method for phase lock loops (PLLs),
always remembering and verifying a current setting before initiating a new setting.

31. The method of claim 30 further comprising: 

dividing an entire temperature operating range of the multiple logic modules from
a normal operating temperature to a near hardware damage temperature at which the logic
modules are powered off, forming continuous, programmable regions known as "cooling
states," wherein within each cooling state exists a single "optimized" clock speed that is
maintained while in a given cooling state.

32. The method of claim 31, wherein each temperature range defining a
corresponding cooling state has hysteresis built in to prevent oscillation between clock 
speeds. 
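
For illustration only, the cooling states with hysteresis of claims 31 and 32, together with the worst-module rule of claim 29, could be sketched as a small table-driven state machine. The thresholds and clock speeds in `COOLING_STATES` are invented values, and the power-off region near the damage temperature is omitted.

```python
# Hypothetical table: (enter_at_or_above_c, leave_below_c, clock_mhz).
# The gap between the two thresholds of a state is its hysteresis band.
COOLING_STATES = [
    (None, None, 1000),  # state 0: normal operating temperature
    (30.0, 25.0, 950),
    (45.0, 40.0, 900),
    (60.0, 55.0, 850),
]


def next_state(state, temp_c):
    """Return the cooling state after observing temp_c in the given state."""
    # Degrade while the temperature reaches the next state's entry threshold.
    while (state + 1 < len(COOLING_STATES)
           and temp_c >= COOLING_STATES[state + 1][0]):
        state += 1
    # Recover only once the temperature falls below the lower exit threshold.
    while state > 0 and temp_c < COOLING_STATES[state][1]:
        state -= 1
    return state


def server_clock_mhz(module_states):
    """Pick the clock for the most thermally degraded module in the server."""
    return COOLING_STATES[max(module_states)][2]
```

A module at 27 degrees stays in state 1 if it was already there but stays in state 0 if it was in state 0, which is what prevents oscillation between clock speeds near a boundary.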

33. The method of claim 30, wherein the incrementing includes incrementing
several clocks affecting one processor system in a manner such that an optimum ratio
of clock speeds is always maintained.

34. The method of claim 33, wherein a minimum deviation from a known "ideal 
clock ratio" is always maintained between any two clocks. 

35. The method of claim 34 further comprising: 

using a maximum increment smaller than what could cause the PLLs to lose lock 
due to noise to at least one of an oscillator and to a chip having the PLL. 

36. The method of claim 35, wherein a total clock increment between two cooling 
states is achieved in a stepwise series of "pseudo-linear" increments of clocks. 
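
The stepwise clock change of claims 30 and 33 through 36 might be sketched as follows. `FakePLL` is a hypothetical stand-in for PLL hardware, and the sketch assumes that every clock is derived from one base frequency by a fixed ratio and that the maximum increment is expressed on that base frequency.

```python
class FakePLL:
    """Stand-in for a PLL; a real system would read and write registers."""

    def __init__(self, mhz):
        self.mhz = mhz

    def read(self):
        return self.mhz

    def write(self, mhz):
        self.mhz = mhz


def ramp_clocks(plls, ratios, base_target_mhz, max_step_mhz):
    """Move all clocks toward the target in bounded steps at fixed ratios."""
    remembered = [pll.read() for pll in plls]
    base = remembered[0] / ratios[0]
    history = [base]
    while base != base_target_mhz:
        # Step 1: verify each PLL still holds the remembered setting.
        for pll, expected in zip(plls, remembered):
            if pll.read() != expected:
                raise RuntimeError("PLL setting changed unexpectedly")
        # Step 2: apply one increment no larger than max_step_mhz.
        remaining = base_target_mhz - base
        if abs(remaining) <= max_step_mhz:
            base = base_target_mhz
        else:
            base += max_step_mhz if remaining > 0 else -max_step_mhz
        remembered = [base * ratio for ratio in ratios]
        for pll, value in zip(plls, remembered):
            pll.write(value)
        history.append(base)
    return history
```

Because every PLL is rewritten from the same base value on each step, the ratio between any two clocks is unchanged throughout the ramp.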

37. The method of claim 35 further comprising: 

changing the clocks of multiple clock boundaries with different oscillators and 
different frequencies always maintaining ideal clock ratios and minimum increments 
when suitable for best performance in a new temperature environment. 

38. The method of claim 31 further comprising: 

verifying at each system IML that the phase lock loops are capable of operating at 
every possible clock speed inside the range of operation. 

39. The method of claim 38 further comprising: 

using product cooling hardware including one of refrigerant, water, and air 
cooling to temperature bias stress test the logic modules prior to shipping. 

40. The method of claim 35 wherein if a problem with the cooling system is
present at system power on and the IML initialization stage, the IML happens in a
degraded cooling state.

41. The method of claim 40, wherein after the cooling problem is repaired, the 
server does not speed up to its fastest clock speed to limit risk to an elastic interface. 

42. The method of claim 30, further comprising: 

storing and updating clock data in a SEEPROM for later use in clock
adjustments. 

43. The method of claim 40 further comprising: 

alerting an operator when the system is operating in a given degraded cooling 
range; and 

notifying the operator not to re-IML until the cooling problem is serviced.

44. The method of claim 42, further comprising: 

executing a repair and verify procedure configured to automatically remove error 
registers when the cooling problem has been repaired; and 

allowing the server to increase its clock speed until it reaches a clock speed at 
which it was IMLed. 